Supervised Feature Subset Selection Based On Extended Fuzzy Relative Information Measure For Boundary Samples

Authors

  • K. Sarojini
  • K. Thangavel
Abstract

Feature subset selection is an essential preprocessing task in data mining. This paper presents a new method for supervised feature subset selection called Extended Fuzzy Relative Information Measure for Boundary Samples (EFRIMBS). The proposed algorithm uses boundary samples instead of the full set of samples. First, discretization algorithms such as K-Means, Fuzzy C-Means, and Median as Initial Centroid of K-Means are applied to discretize numeric features and construct the membership functions of each fuzzy set of a feature. Then the proposed EFRIMBS is applied to select a feature subset, focusing on boundary samples. J. D. Shie and S. M. Chen's fuzzy entropy measure (FE) for feature subset selection is also applied with the different discretization algorithms, and its results are compared with those of the proposed algorithm. The FE-based feature selection algorithm is efficient only for smaller datasets, whereas the proposed method is efficient for both small and large datasets and selects the minimum number of features for the feature subset. Experimental results on UCI datasets show that the proposed method produces better results than the existing one.
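The abstract does not say how the cluster centers become membership functions. As a minimal sketch only, assuming triangular membership functions peaked at the one-dimensional K-Means centers of a feature (a common construction, not necessarily the authors' exact one), the discretization step might look like the following. The function names `kmeans_1d` and `triangular_memberships` and the evenly spaced initialization are illustrative assumptions, and the EFRIMBS measure itself is not reproduced here.

```python
def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's k-means on one numeric feature; returns sorted centers."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic start with k centers spread over the range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        updated = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def triangular_memberships(x, centers):
    """Membership degree of x in each fuzzy set, one set per sorted center.

    Assumption: each set peaks (degree 1) at its own center and falls
    linearly to 0 at the neighbouring centers; the outermost sets stay
    at 1 beyond the first and last center.
    """
    degrees = [0.0] * len(centers)
    if x <= centers[0]:
        degrees[0] = 1.0
    elif x >= centers[-1]:
        degrees[-1] = 1.0
    else:
        for j in range(len(centers) - 1):
            left, right = centers[j], centers[j + 1]
            if left <= x <= right:
                degrees[j + 1] = (x - left) / (right - left)
                degrees[j] = 1.0 - degrees[j + 1]
                break
    return degrees
```

With this construction the degrees of any value sum to 1 across the fuzzy sets of a feature, so a value lying between two centers is shared between the two adjacent sets rather than assigned to a single crisp interval.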

Similar resources

Supervised Feature Subset Selection using Extended Fuzzy Absolute Information Measure for Different Classifiers

Feature subset selection plays an important role in data mining and machine learning applications. The main aim of feature subset selection is to reduce dimensionality by removing irrelevant and redundant features and to improve classification accuracy. This paper presents a supervised feature selection method called Extended Fuzzy Absolute Information Measure (EFAIM) for different classifiers...


A New Hybrid Feature Subset Selection Algorithm for the Analysis of Ovarian Cancer Data Using Laser Mass Spectrum

Introduction: A major problem in the treatment of cancer is the lack of an appropriate method for the early diagnosis of the disease. The chemical reaction within an organ may be reflected in the form of proteomic patterns in the serum, sputum, or urine. Laser mass spectrometry is a valuable tool for extracting the proteomic patterns from biological samples. A major challenge in extracting such ...


A hybrid filter-based feature selection method via hesitant fuzzy and rough sets concepts

High-dimensional microarray datasets are difficult to classify since they have many features with a small number of instances and an imbalanced distribution of classes. This paper proposes a filter-based feature selection method to improve the classification performance of microarray datasets by selecting the significant features. Combining the concepts of rough sets, weighted rough set, fuzzy rough se...


Diagnosis of the disease using an ant colony gene selection method based on information gain ratio using fuzzy rough sets

With the advancement of metagenome data mining, science has become focused on microarrays. Microarrays are datasets with a large number of genes that are usually irrelevant to the output class; hence, gene selection (feature selection) is essential. Removing redundant genes can increase the speed and accuracy of classification. After applying the gene se...


Feature Selection Using Multi Objective Genetic Algorithm with Support Vector Machine

Different approaches have been proposed for feature selection to obtain a suitable feature subset from among all features. These methods search the feature space for feature subsets that satisfy some criteria or optimize several objective functions. The objective functions are divided into two main groups: filter and wrapper methods. In filter methods, feature subsets are selected due to some measu...




Publication date: 2010